internal: simplify lexer using .to_slice() to eliminate allocations #5476

Merged
max-sixty merged 2 commits into PRQL:main from max-sixty:lexer-review on Oct 8, 2025

@max-sixty
Member

Summary

Simplify the lexer by using chumsky 0.10's .to_slice() method to eliminate unnecessary Vec<char> allocations.

Changes

  • parse_integer(): Changed return type from Vec<char> to &str, using .to_slice() instead of manual char collection
  • ident_part(): Simplified plain identifier parsing using .to_slice()
  • param(): Added .to_slice() before final string conversion
  • keyword(): Added .to_slice() and resolved TODO comment
  • number(): Cascading simplifications in fraction and exponent parsing
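
The idea behind these changes can be sketched without chumsky at all. The following is a minimal, standard-library-only illustration of returning a borrowed slice of the input instead of collecting digits into a `Vec<char>`. It is not the PRQL lexer: the function names (`integer_collected`, `integer_slice`, `number_slice`) are hypothetical, and the number grammar (digits, optional `.digits`, optional `e`/`E` exponent with sign) is a simplifying assumption.

```rust
/// Old style (hypothetical): copy each digit into a freshly allocated Vec<char>.
fn integer_collected(input: &str) -> Option<(Vec<char>, &str)> {
    let digits: Vec<char> = input.chars().take_while(|c| c.is_ascii_digit()).collect();
    if digits.is_empty() {
        return None;
    }
    // ASCII digits are one byte each, so the char count equals the byte offset.
    let rest = &input[digits.len()..];
    Some((digits, rest))
}

/// New style (hypothetical): return a borrowed slice of the input, no allocation.
fn integer_slice(input: &str) -> Option<(&str, &str)> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return None;
    }
    Some(input.split_at(end))
}

/// Because the parts are slices of one buffer, a whole number can be returned
/// as a single slice by tracking how many bytes were consumed.
/// Assumed grammar: digits, optional ".digits", optional exponent like "e-3".
fn number_slice(input: &str) -> Option<(&str, &str)> {
    let (int_part, mut rest) = integer_slice(input)?;
    let mut end = int_part.len();

    // Optional fraction: '.' followed by at least one digit.
    if let Some(after_dot) = rest.strip_prefix('.') {
        if let Some((frac, tail)) = integer_slice(after_dot) {
            end += 1 + frac.len();
            rest = tail;
        }
    }

    // Optional exponent: 'e' or 'E', optional sign, at least one digit.
    if let Some(after_e) = rest.strip_prefix(|c: char| c == 'e' || c == 'E') {
        let after_sign = after_e
            .strip_prefix(|c: char| c == '+' || c == '-')
            .unwrap_or(after_e);
        if let Some((exp, _tail)) = integer_slice(after_sign) {
            let sign_len = after_e.len() - after_sign.len();
            end += 1 + sign_len + exp.len();
        }
    }

    Some(input.split_at(end))
}
```

Calling `number_slice("12.5e-3 rest")` yields the pair `("12.5e-3", " rest")`, and both halves point into the original string rather than into new heap storage. The collected variant has to allocate once per call, which is the cost the `.to_slice()` changes are meant to avoid.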

Impact

Eliminates roughly four or more Vec<char> allocations per token for identifiers, numbers, and parameters. The resulting code is more efficient and more idiomatic for chumsky 0.10, with no functional changes.

Test plan

  • ✅ All 579 tests pass
  • ✅ Pre-commit lints pass

🤖 Generated with Claude Code

max-sixty and others added 2 commits October 8, 2025 09:11
Use chumsky 0.10's `.to_slice()` method to eliminate unnecessary `Vec<char>`
allocations in the lexer:

- `parse_integer()`: Changed return type from `Vec<char>` to `&str`
- `ident_part()`: Simplified using `.to_slice()` instead of manual char collection
- `param()`: Added `.to_slice()` before final string conversion
- `keyword()`: Added `.to_slice()` and resolved TODO comment
- `number()`: Cascading simplifications in fraction and exponent parsing

This eliminates ~4+ Vec allocations per token for identifiers, numbers, and
parameters, resulting in more efficient and idiomatic chumsky 0.10 code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Additional simplifications to eliminate Vec<char> allocations:

- `raw_string()`: Use `.to_slice()` instead of collecting to Vec<char>
- `digits()` helper: Changed return type from `Vec<char>` to `&str`
- `time_component()`: Updated to accept `&str` instead of `Vec<char>`
- Date/time parsing: Eliminated several Vec allocations in timestamp parsing
- Clarified TODO comment about date_inner() requiring enum changes

These changes further reduce allocations in the lexer, particularly for
date/time literals and raw strings.
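
As a rough illustration of why accepting `&str` helps in date/time parsing: a string slice can be handed straight to `str::parse`, whereas a `Vec<char>` first has to be collected back into a `String`. The sketch below uses only the standard library. The signatures of `digits`, `time_component`, and `parse_time` here are hypothetical and are not taken from the PRQL source; the `HH:MM:SS` format and the range limits are assumptions for the example.

```rust
/// Hypothetical helper: read exactly `n` ASCII digits from the front of `input`,
/// returning them as a borrowed slice together with the remaining input.
fn digits(input: &str, n: usize) -> Option<(&str, &str)> {
    // `get` returns None if the input is shorter than `n` bytes
    // or `n` does not fall on a character boundary.
    let head = input.get(..n)?;
    if head.bytes().all(|b| b.is_ascii_digit()) {
        Some((head, &input[n..]))
    } else {
        None
    }
}

/// Hypothetical component parser: takes the digit slice directly,
/// so no intermediate String or Vec<char> is needed before parsing.
fn time_component(text: &str, max: u32) -> Option<u32> {
    let value: u32 = text.parse().ok()?;
    (value <= max).then_some(value)
}

/// Parse an assumed "HH:MM:SS" literal into (hours, minutes, seconds).
fn parse_time(input: &str) -> Option<(u32, u32, u32)> {
    let (hh, rest) = digits(input, 2)?;
    let (mm, rest) = digits(rest.strip_prefix(':')?, 2)?;
    let (ss, rest) = digits(rest.strip_prefix(':')?, 2)?;
    if !rest.is_empty() {
        return None;
    }
    Some((
        time_component(hh, 23)?,
        time_component(mm, 59)?,
        time_component(ss, 59)?,
    ))
}
```

With these definitions, `parse_time("09:11:47")` returns `Some((9, 11, 47))`, while inputs such as `"24:00:00"` or `"9:11:47"` return `None`.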

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@max-sixty max-sixty merged commit eea97ee into PRQL:main Oct 8, 2025
36 checks passed
@max-sixty max-sixty deleted the lexer-review branch October 8, 2025 16:47